; 1, L0608: 1.
H0624: 10, S0192: 8,
S0022: 7, S0212: 5,
H0031: 4, H0412: 4,
S0028: 4, L0747: 4, 
S0194: 4, H0717: 3, 
H0427: 3, L0471: 3, 
S0250: 3, H0039: 3, 
Lys-1 to Glu-6,
Pro-8 to Gln-21, 
Ser-43 to Glu-51, 
Val-61 to Gly-68, 
Arg-75 to Pro-87, 
Ser-92 to Phe-98. 


Lys-13 to Ser-41, 
Lys-57 to Lys-66, 
Lys-89 to Lys-97, 
Lys-113 to Ala-118,
Pro-131 to Ala-147,
Ala-159 to Ala-169.


Lys-24 to Lys-31,
Gly-132 to Glu-137.
: 1, L0366: 1,










Ala-24 to Lys-29,
Lys-51 to Gly-62. 




Clone ID NO:Z | Contig ID | SEQ ID NO:X | ORF (From-To) | SEQ ID NO:Y
HNORD11 | HNORD11R | 1333 | 155-430 | 3508
HNORE65 | HNORE65R | 1334 | 2-436 | 3509






WO 02/00677

PCT/US01/18569











H0575: 1, H0706
H0052: 1, H0309
H0596: 1, H0546:
H0545: 1, H0086:
r0023: 1, L0143:
H0551: 1, H0494:
S0438: 1, S0002:
L0371: 1, L0770:
L0769: 1, L0667:
L0372: 1, L0646:
L0553: 1, L0771:
L0774: 1, L0375:
L0806: 1, L0382:
L0784: 2, H0428: 1,
H0644: 1, S0366: 1,
L0659: 1, L0636: 1,
L0789: 1 and H0658: 1.


L0766: 1 and H0658: 
2. 




: 4, H0351: 3,
2, H0520: 2,
2, H0657: 1,
1, S0358: 1,
1, L0021: 1,
1, H0328: 1,
1, H0616: 1,
1, S0438: 1,
1, S0150: 1,
1, H0641: 1,
1, L0768: 1,
1, L0776: 1,
1, H0660: 1,
1, S0380: 1,
1, L0747: 1,
1, H0136: 1,
1, H0422: 1 and
1.












Gln-15 to Arg-20,
Arg-35 to Ser-40.


Ile-5 to Lys-20.


Asp-28 to His-35, 
Phe-43 to Trp-48, 
Asn-81 to Arg-86. 
L0770: 4, L0771: 4,
L0769: 3, L0757: 3,
L0766: 2, L0779: 2,
L0758: 2, H0170: 1,
H0686: 1, S0476: 1,
H0586: 1, L0637: 1,
L0761: 1, L0803: 1,
L0774: 1, L0776: 1,
Luojy. 1, LUouy: i,
L0791: 1, H0658: 1,
H0696: 1, S3012: 1,
S0390: 1, L0747: 1,
L0752: 1, L0753: 1,
L0759: 1, S0434: 1,
L0592: 1 and H0543: 1.




L0439: 4, H0658: 2, 
S0474: 1, L0794: 1,
L0804: 1, L0791: 1, 
L0755: 1 and L0599: 1. 


















Asp-17 to Asn-30. 






Lys-32 to Glu-39, 
Glu-43 to Ala-50, 
Ala-52 to Glu-80.
: 1 and










H0658: 2, L0745: 2,
S0422: 1, H0648: 1 and
L0731: 1.


H0542: 11, H0265: 7,
H0543: 7, L0751: 5,
S0424: 5, H0556: 4.


L0542: 1, L0647:
L0787: 1, H0658
H0670: 1, H0648
H0539: 1, H0710
S0332: 1, H0478:
H0627: 1, L0755:
S0436: 1, L0605:
L0362: 1, H0667:
S0276: 1, H0543:
H0422: 1, H0506:
H0352: 1.




Gln-121 to Glu-136, 
Ala-153 to Ser-162,
Ala-176 to Ser-182. 






Phe-25 to Arg-34,
Cys-51 to His-56,
Glu-66 to Ser-75.




His-2 to Trp-13,
Lys-22 to Cys-31.


Gly-17 to Glu-26,
Lys-32 to Leu-37,
Phe-46 to Thr-52,
AR104: 42, AR089:
22, AR060: 7, AR096:
4, AR055: 2, AR061:






















H0031: 2, H0135: 1,
i0272: 1 and H0658: 1.














Asn-20 to Asp-25, 
Met-75 to Asp-81,
Arg-144 to Thr-152.


Lys-1 to Leu-23,
Ala-27 to Ser-34,
Gly-36 to Phe-44.








Leu-11 to Cys-24, 
Leu-30 to Ser-36, 
Gly-93 to Glu-99. 




Ser-120 to Tyr-126, 
Gly-137 to Arg-142, 
Ser-160 to Ser-168.
S0132: 1, H0156: 1,
H0575: 1, H0551: 1,
H0658: 1, H0670: 1,
H0555: 1 and L0439: 1.




L0766: 11, L07
L0754: 5, H0547
S0358: 3, S0022:
L0756: 3, L0777
H0625: 2, L0065
S0438: 2, L0805:
L0776: 2, H0658
S0328: 2, L0740:
L0745: 2, L0747:
L0750: 2, L0731:
L0601: 2, H0542
H0543: 2, H0423
S0424: 2, H0341:
H0661: 1, H0664
H0306: 1, H0402
H0638: 1, S0360:
H0580: 1, H0645:
H0443: 1, H0357:
^0592: 1, H0587:
i0574: 1, H0486:
i0270: 1, H0156:
r0048: 1, H0052:


Cys-27 to Gly-43, 
Ala-46 to Trp-54, 
Ala-56 to Arg-68,
Phe-83 to Arg-93.


Thr-4 to His-17,
Ser-21 to Gln-27. 


Thr-57 to Phe-62, 
Gly-68 to Phe-73, 
His-86 to Tyr-92,
Asp-97 to Phe-103.
L0748: 3, H0556: 2,
S0420: 2, S0045: 2,
a0171: 1, H0265: 1,
S0134: 1, H0657: 1,






Gly-8 to Val-14,
Gln-20 to Asp-27,
Met-79 to His-88,
Arg-171 to Tyr-183,
Pro-198 to Gly-204. 
Gly-6 to Ser-16.
Glu-19 to Ala-33,
Glu-44 to Lys-60,
Ile-135 to Lys-147.


Ala-3 to Lys-11,
Gly-112 to Glu-117.




Lys-94 to Ser-100.


Lys-32 to Glu-38,
Ser-44 to Ser-55,
Gln-67 to Gly-78,
Glu-85 to Glu-90,
Gly-108 to Ser-114,
Phe-149 to Lys-158.
; 1, T0041: 1.
1, S0142: 1,
1, H0695: 1,
[0040] The first column in Table 1 provides a unique "Clone ID NO:Z" for a cDNA
clone related to each contig sequence disclosed in Table 1. This clone ID references the
cDNA clone which contains at least the 5' most sequence of the assembled contig, and at
least a portion of SEQ ID NO:X was determined by directly sequencing the referenced
clone. The reference clone may have more sequence than described in the sequence listing
or the clone may have less. In the vast majority of cases, however, the clone is believed to
encode a full-length polypeptide. In the case where a clone is not full-length, a full-length
cDNA can be obtained by methods known in the art and/or as described elsewhere herein.
[0041] The second column in Table 1 provides a unique "Contig ID" identification
for each contig sequence. The third column provides the "SEQ ID NO:X" identifier for
each of the ovarian associated contig polynucleotide sequences disclosed in Table 1. The
fourth column, "ORF (From-To)", provides the location (i.e., nucleotide position numbers)
within the polynucleotide sequence "SEQ ID NO:X" that delineate the preferred open
reading frame (ORF) shown in the sequence listing and referenced in Table 1, column 5, as
SEQ ID NO:Y. Where the nucleotide position number "To" is lower than the nucleotide
position number "From", the preferred ORF is the reverse complement of the referenced
polynucleotide sequence.
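
The "ORF (From-To)" convention described for column 4 can be illustrated with a short sketch. It is not part of the disclosed method: the function names are hypothetical, and the positions are assumed to be 1-based and inclusive, which the text does not state explicitly.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def extract_orf(seq: str, orf_from: int, orf_to: int) -> str:
    """Extract the ORF delineated by 'From' and 'To' nucleotide positions.

    Assumption: positions are 1-based and inclusive. When 'To' is lower than
    'From', the ORF is read from the reverse complement, as in column 4.
    """
    if orf_from <= orf_to:
        return seq[orf_from - 1:orf_to]
    return reverse_complement(seq[orf_to - 1:orf_from])


if __name__ == "__main__":
    contig = "AAATGCCCTAGGG"  # made-up sequence, for illustration only
    print(extract_orf(contig, 3, 11))                       # forward-strand ORF
    print(extract_orf(reverse_complement(contig), 11, 3))   # same ORF, reverse strand
```

Both calls recover the same ORF, once from the forward strand and once from a contig stored in the opposite orientation.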

[0042] The fifth column in Table 1 provides the corresponding SEQ ID NO:Y for
the polypeptide sequence encoded by the preferred ORF delineated in column 4. In one
embodiment, the invention provides an amino acid sequence comprising, or alternatively
consisting of, a polypeptide encoded by the portion of SEQ ID NO:X delineated by "ORF
(From-To)". Also provided are polynucleotides encoding such amino acid sequences and
the complementary strand thereto.

[0043] Column 6 in Table 1 lists residues comprising epitopes contained in the
polypeptides encoded by the preferred ORF (SEQ ID NO:Y), as predicted using the
algorithm of Jameson and Wolf, (1988) Comp. Appl. Biosci. 4:181-186. The Jameson-Wolf
antigenic analysis was performed using the computer program PROTEAN (Version
3.11 for the Power Macintosh, DNASTAR, Inc., 1228 South Park Street, Madison, WI). In
specific embodiments, polypeptides of the invention comprise, or alternatively consist of, at
least one, two, three, four, five or more of the predicted epitopes as described in Table 1. It
will be appreciated that depending on the analytical criteria used to predict antigenic
determinants, the exact address of the determinant may vary slightly.
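
Column 6 entries take the form "Lys-13 to Ser-41". As a convenience for readers working with the table, the following minimal sketch parses such entries into residue ranges. The regular expression and function names are assumptions made for illustration; the sketch does not implement the Jameson-Wolf prediction itself.

```python
import re

# Three-letter residue name, hyphen, position; "to" may or may not be spaced.
EPITOPE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s*to\s*([A-Z][a-z]{2})-(\d+)")


def parse_epitopes(text: str) -> list[tuple[str, int, str, int]]:
    """Parse entries such as 'Lys-13 to Ser-41' into (start_res, start, end_res, end)."""
    return [(a, int(i), b, int(j)) for a, i, b, j in EPITOPE.findall(text)]


def epitope_length(epitope: tuple[str, int, str, int]) -> int:
    """Number of residues spanned by an epitope, counting both end residues."""
    _, start, _, end = epitope
    return end - start + 1


if __name__ == "__main__":
    for ep in parse_epitopes("Lys-13 to Ser-41, Lys-57 to Lys-66, Lys-89 to Lys-97."):
        print(ep, epitope_length(ep))
```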
[0044] Column 7 in Table 1 provides an expression profile and library code: count
for each of the contig sequences (SEQ ID NO:X) disclosed in Table 1, which can routinely
be combined with the information provided in Table 4 and used to determine the normal or
diseased tissues, cells, and/or cell line libraries which predominantly express the
polynucleotides of the invention. The first number in column 7 (preceding the colon)
represents the tissue/cell source identifier code corresponding to the code and description
provided in Table 4. For those identifier codes in which the first two letters are not "AR",
the second number in column 7 (following the colon) represents the number of times a
sequence corresponding to the reference polynucleotide sequence was identified in the
tissue/cell source. Those tissue/cell source identifier codes in which the first two letters are
"AR" designate information generated using DNA array technology. Utilizing this
technology, cDNAs were amplified by PCR and then transferred, in duplicate, onto the
array. Gene expression was assayed through hybridization of first strand cDNA probes to
the DNA array. cDNA probes were generated from total RNA extracted from a variety of
different tissues and cell lines. Probe synthesis was performed in the presence of 33P dCTP,
using oligo(dT) to prime reverse transcription. After hybridization, high stringency washing
conditions were employed to remove non-specific hybrids from the array. The remaining
signal, emanating from each gene target, was measured using a Phosphorimager. Gene
expression was reported as Phosphor Stimulating Luminescence (PSL) which reflects the
level of phosphor signal generated from the probe hybridized to each of the gene targets
represented on the array. A local background signal subtraction was performed before the
total signal generated from each array was used to normalize gene expression between the
different hybridizations. The value presented after "[array code]:" represents the mean of
the duplicate values, following background subtraction and probe normalization. One of
skill in the art could routinely use this information to identify normal and/or diseased
tissue(s) which show a predominant expression pattern of the corresponding polynucleotide
of the invention or to identify polynucleotides which show predominant and/or specific
tissue and/or cell expression. The sequences disclosed herein have been determined to be
predominantly expressed in ovarian tissues, including normal and diseased ovarian tissues
(See Table 1, column 7 and Table 4).
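
The "library code: count" entries of column 7 can be separated into library counts and "AR" array values with a few lines of code. This is a sketch under stated assumptions: the code pattern (one or two capital letters followed by three or four digits) is inferred from the entries shown in Table 1, and the function name is hypothetical.

```python
import re

# Assumed code format, e.g. "H0624", "S3012", "AR104"; value follows a colon.
ENTRY = re.compile(r"([A-Z]{1,2}\d{3,4})\s*:\s*(\d+(?:\.\d+)?)")


def parse_expression_profile(text: str) -> tuple[dict[str, int], dict[str, float]]:
    """Split a column 7 entry into (library counts, array values).

    Codes starting with "AR" carry a mean array signal; all other codes carry
    the number of times the sequence was identified in that tissue/cell source.
    """
    library_counts: dict[str, int] = {}
    array_values: dict[str, float] = {}
    for code, value in ENTRY.findall(text):
        if code.startswith("AR"):
            array_values[code] = float(value)
        else:
            library_counts[code] = int(float(value))
    return library_counts, array_values


if __name__ == "__main__":
    counts, arrays = parse_expression_profile(
        "AR104: 42, AR089: 22, H0624: 10, S0192: 8 and H0658: 1."
    )
    # Library with the highest count, i.e. the predominant tissue/cell source.
    print(max(counts, key=counts.get), counts, arrays)
```

Looking up the highest-count code in Table 4 would then give the tissue or cell source in which the sequence was most often identified.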



[0045] Column 8 in Table 1 provides a chromosomal map location for certain
polynucleotides of the invention. Chromosomal location was determined by finding exact
matches to EST and cDNA sequences contained in the NCBI (National Center for
Biotechnology Information) UniGene database. Each sequence in the UniGene database is
assigned to a "cluster"; all of the ESTs, cDNAs, and STSs in a cluster are believed to be
derived from a single gene. Chromosomal mapping data is often available for one or more
sequence(s) in a UniGene cluster; this data (if consistent) is then applied to the cluster as a
whole. Thus, it is possible to infer the chromosomal location of a new polynucleotide
sequence by determining its identity with a mapped UniGene cluster.

[0046] A modified version of the computer program BLASTN (Altschul et al., J.
Mol. Biol. 215:403-410 (1990), and Gish et al., Nat. Genet. 3:266-272 (1993)) was used to
search the UniGene database for EST or cDNA sequences that contain exact or near-exact
matches to a polynucleotide sequence of the invention (the 'Query'). A sequence from the
UniGene database (the 'Subject') was said to be an exact match if it contained a segment of
50 nucleotides in length such that 48 of those nucleotides were in the same order as found
in the Query sequence. If all of the matches that met this criterion were in the same UniGene
cluster, and mapping data was available for this cluster, it is indicated in Table 1 under the
heading "Cytologic Band". Where a cluster had been further localized to a distinct cytologic
band, that band is disclosed; where no banding information was available, but the gene had
been localized to a single chromosome, the chromosome is disclosed.

[0047] Once a presumptive chromosomal location was determined for a
polynucleotide of the invention, an associated disease locus was identified by comparison
with a database of diseases which have been experimentally associated with genetic loci.
The database used was the Morbid Map, derived from OMIM™ (supra). If the putative
chromosomal location of a polynucleotide of the invention (Query sequence) was
associated with a disease in the Morbid Map database, an OMIM reference identification
number was noted in column 9, Table 1, labeled "OMIM Disease Reference(s)". Table 5 is
a key to the OMIM reference identification numbers (column 1), and provides a description
of the associated disease in Column 2.
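
The "exact match" criterion used for the UniGene search (a 50-nucleotide segment in which 48 nucleotides agree with the Query) can be sketched as a sliding-window check. This is a simplification under stated assumptions: the two sequences are assumed to be already aligned without gaps and of equal length, whereas the modified BLASTN program referred to above produces its own alignments. The function name is hypothetical.

```python
def has_near_exact_segment(query_aln: str, subject_aln: str,
                           window: int = 50, min_matches: int = 48) -> bool:
    """Return True if any `window`-long stretch has at least `min_matches` identities.

    Assumption: the inputs are an ungapped, position-by-position alignment.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    matches = [q == s for q, s in zip(query_aln, subject_aln)]
    if len(matches) < window:
        return False
    count = sum(matches[:window])          # identities in the first window
    if count >= min_matches:
        return True
    for i in range(window, len(matches)):  # slide the window one base at a time
        count += matches[i] - matches[i - window]
        if count >= min_matches:
            return True
    return False


if __name__ == "__main__":
    query = "ACGT" * 15                    # made-up 60-nt sequence
    subject = list(query)
    subject[0] = subject[10] = "N"         # two mismatches inside the first window
    print(has_near_exact_segment(query, "".join(subject)))
```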
[0048] Table 2 further characterizes certain encoded polypeptides of the
invention, by providing the results of comparisons to protein and protein family databases.
The first column provides a unique clone identifier, "Clone ID NO:", corresponding to a
cDNA clone disclosed in Table 1. The second column provides the unique contig
identifier, "Contig ID:", which allows correlation with the information in Table 1. The
third column provides the sequence identifier, "SEQ ID NO:X", for the contig
polynucleotide sequences. The fourth column provides the analysis method by which the
homology/identity disclosed in the row was determined. The fifth column provides a
description of PFam/NR hits having significant matches identified by each analysis.
Column six provides the accession number of the PFam/NR hit disclosed in the fifth
column. Column seven, "Score/Percent Identity", provides a quality score or the percent
identity of the hit disclosed in column five. Comparisons were made between
polypeptides encoded by polynucleotides of the invention and a non-redundant protein
database (herein referred to as "NR"), or a database of protein families (herein referred to
as "PFam"), as described below.

[0049] The NR database, which comprises the NBRF PIR database, the NCBI
GenPept database, and the SIB SwissProt and TrEMBL databases, was made
non-redundant using the computer program nrdb2 (Warren Gish, Washington University in
Saint Louis). Each of the polynucleotides shown in Table 1, column 3 (e.g., SEQ ID
NO:X or the 'Query' sequence) was used to search against the NR database. The computer
program BLASTX was used to compare a 6-frame translation of the Query sequence to
the NR database (for information about the BLASTX algorithm please see Altschul et al., J.
Mol. Biol. 215:403-410 (1990), and Gish et al., Nat. Genet. 3:266-272 (1993)). A
description of the sequence that is most similar to the Query sequence (the highest scoring
'Subject') is shown in column five of Table 2 and the database accession number for that
sequence is provided in column six. The highest scoring 'Subject' is reported in Table 2 if
(a) the estimated probability that the match occurred by chance alone is less than 1.0e-07,
and (b) the match was not to a known repetitive element. BLASTX returns alignments of
short polypeptide segments of the Query and Subject sequences which share a high degree
of similarity; these segments are known as High-Scoring Segment Pairs or HSPs. Table 2
reports the degree of similarity between the Query and the Subject for each HSP as a
percent identity in Column 7. The percent identity is determined by taking the number
of exact matches between the two aligned sequences in the HSP, dividing by the number
of Query amino acids in the HSP, and multiplying by 100. The polynucleotides of SEQ ID
NO:X which encode the polypeptide sequence that generates an HSP are delineated by
columns 8 and 9 of Table 2.
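
The percent identity calculation described for column 7 of Table 2 can be written out as a small worked example. It is a sketch only: the gap character "-" and the function name are assumptions, and BLASTX itself reports this figure directly.

```python
def hsp_percent_identity(query_hsp: str, subject_hsp: str, gap: str = "-") -> float:
    """Percent identity of an HSP: exact matches / Query amino acids * 100.

    Both strings are the aligned HSP rows and must have equal length; gap
    characters in the Query row are not counted as Query amino acids.
    """
    if len(query_hsp) != len(subject_hsp):
        raise ValueError("aligned HSP rows must have equal length")
    query_residues = sum(1 for q in query_hsp if q != gap)
    if query_residues == 0:
        raise ValueError("Query row contains no amino acids")
    exact = sum(1 for q, s in zip(query_hsp, subject_hsp) if q == s and q != gap)
    return 100.0 * exact / query_residues


if __name__ == "__main__":
    # 7 exact matches over 8 Query residues (made-up peptide segments).
    print(hsp_percent_identity("MKTAYIAK", "MKTAYLAK"))
```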

[0050] The PFam database, PFam version 5.2, (Sonnhammer et al., Nucl. Acids
Res., 26:320-322, (1998)) consists of a series of multiple sequence alignments; one
alignment for each protein family. Each multiple sequence alignment is converted into a
probability model called a Hidden Markov Model, or HMM, that represents the
position-specific variation among the sequences that make up the multiple sequence alignment
(see, e.g., R. Durbin et al., Biological sequence analysis: probabilistic models of proteins
and nucleic acids, Cambridge University Press, 1998 for the theory of HMMs). The
program HMMER version 1.8 (Sean Eddy, Washington University in Saint Louis) was
used to compare the predicted protein sequence for each Query sequence (SEQ ID NO:Y
in Table 1) to each of the HMMs derived from PFam version 5.2. A HMM derived from
PFam version 5.2 was said to be a significant match to a polypeptide of the invention if
the score returned by HMMER 1.8 was greater than 0.8 times the HMMER 1.8 score
obtained with the most distantly related known member of that protein family. The
description of the PFam family which shares a significant match with a polypeptide of the
invention is listed in column 5 of Table 2, and the database accession number of the PFam
hit is provided in column 6. Column 7 provides the score returned by HMMER version
1.8 for the alignment. Columns 8 and 9 delineate the polynucleotides of SEQ ID NO:X
which encode the polypeptide sequence which shows a significant match to a PFam
protein family.
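
The two reporting criteria for Table 2 (the chance-probability cutoff for NR hits and the 0.8 score ratio for PFam hits) reduce to simple comparisons. The sketch below restates them in code; the function and parameter names are hypothetical, and the scores and probabilities are assumed to have been obtained beforehand from BLASTX and HMMER output.

```python
def nr_hit_is_reported(chance_probability: float, is_repetitive_element: bool,
                       max_probability: float = 1.0e-07) -> bool:
    """NR criterion: probability of a chance match below 1.0e-07, and the
    match is not to a known repetitive element."""
    return chance_probability < max_probability and not is_repetitive_element


def pfam_hit_is_significant(hmmer_score: float, most_distant_member_score: float,
                            factor: float = 0.8) -> bool:
    """PFam criterion: score greater than 0.8 times the score obtained with the
    most distantly related known member of the family."""
    return hmmer_score > factor * most_distant_member_score


if __name__ == "__main__":
    print(nr_hit_is_reported(1e-20, is_repetitive_element=False))
    print(pfam_hit_is_significant(45.0, most_distant_member_score=50.0))
```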

[0051] As mentioned, columns 8 and 9 in Table 2, "NT From" and "NT To",
delineate the polynucleotides of "SEQ ID NO:X" that encode a polypeptide having a
significant match to the PFam/NR database as disclosed in the fifth column of Table 2. In
one embodiment, the invention provides a protein comprising, or alternatively consisting
of, a polypeptide encoded by the polynucleotides of SEQ ID NO:X delineated in columns
8 and 9 of Table 2. Also provided are polynucleotides encoding such proteins, and the
complementary strand thereto.
[0052] The nucleotide sequence SEQ ID NO:X and the translated SEQ ID NO:Y
are sufficiently accurate and otherwise suitable for a variety of uses well known in the art
and described further below. For instance, the nucleotide sequences of SEQ ID NO:X are
useful for designing nucleic acid hybridization probes that will detect nucleic acid
sequences contained in SEQ ID NO:X or the cDNA contained in Clone ID NO:Z. These
probes will also hybridize to nucleic acid molecules in biological samples, thereby
enabling immediate applications in chromosome mapping, linkage analysis, tissue
identification and/or typing, and a variety of forensic and diagnostic methods of the
invention. Similarly, polypeptides identified from SEQ ID NO:Y may be used to generate
antibodies which bind specifically to these polypeptides, or fragments thereof, and/or to
the polypeptides encoded by the cDNA clones identified in, for example, Table 1.
[0053] Nevertheless, DNA sequences generated by sequencing reactions can
contain sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion
in an open reading frame of over 1000 bases).
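
The effect of a single inserted base can be shown with a toy example. The sketch below uses a made-up 1002-base ORF and the standard genetic code; the sequence, the insertion site, and the helper names are assumptions chosen only to illustrate the frame shift described above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Made-up ORF of 1002 bases and a copy with one erroneously inserted base.
actual = "ATG" + "GCT" * 333
generated = actual[:300] + "A" + actual[300:]

dna_identity = 100.0 * len(actual) / len(generated)   # 1002 of 1003 aligned bases
protein_actual = translate(actual)
protein_generated = translate(generated)
same_positions = sum(a == b for a, b in zip(protein_actual, protein_generated))

if __name__ == "__main__":
    print(f"DNA identity: {dna_identity:.2f}%")
    print(f"identical amino acid positions: {same_positions} of {len(protein_actual)}")
```

In this example the DNA sequences are more than 99.9% identical, yet the predicted protein agrees with the actual one only up to the insertion point.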

[0054] Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X, and a predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing cDNA Clone ID NO:Z (deposited with the ATCC on June 5, 2000 and
given ATCC Deposit Nos. PTA-1982 and PTA-1985; and/or as set forth, for example, in
Tables 1, 6 and 7). The nucleotide sequence of each deposited clone can readily be
determined by sequencing the deposited clone in accordance with known methods.
Further, techniques known in the art can be used to verify the nucleotide sequences of
SEQ ID NO:X.
[0055] The predicted amino acid sequence can then be verified from such deposits.
Moreover, the amino acid sequence of the protein encoded by a particular clone can also
be directly determined by peptide sequencing or by expressing the protein in a suitable
host cell containing the deposited human cDNA, collecting the protein, and determining
its sequence.

RACE Protocol For Recovery of Full-Length Genes 

[0056] Partial cDNA clones can be made full-length by utilizing the rapid
amplification of cDNA ends (RACE) procedure described in Frohman, M.A., et al., Proc.
Natl. Acad. Sci. USA, 85:8998-9002 (1988). A cDNA clone missing either the 5' or 3'
end can be reconstructed to include the absent base pairs extending to the translational
start or stop codon, respectively. In some cases, cDNAs are missing the start codon of
translation. The following briefly describes a modification of this original 5' RACE
procedure. Poly A+ or total RNA is reverse transcribed with SuperScript II (Gibco/BRL)
and an antisense or complementary primer specific to the cDNA sequence. The primer is
removed from the reaction with a Microcon Concentrator (Amicon). The first-strand
cDNA is then tailed with dATP and terminal deoxynucleotide transferase (Gibco/BRL).
Thus, an anchor sequence is produced which is needed for PCR amplification. The
second strand is synthesized from the dA-tail in PCR buffer, Taq DNA polymerase
(Perkin-Elmer Cetus), an oligo-dT primer containing three adjacent restriction sites (XhoI,
SalI and ClaI) at the 5' end and a primer containing just these restriction sites. This
double-stranded cDNA is PCR amplified for 40 cycles with the same primers as well as a
nested cDNA-specific antisense primer. The PCR products are size-separated on an
ethidium bromide-agarose gel and the region of gel containing cDNA products of the
predicted size of missing protein-coding DNA is removed. cDNA is purified from the
agarose with the Magic PCR Prep kit (Promega), restriction digested with XhoI or SalI,
and ligated to a plasmid such as pBluescript SKII (Stratagene) at XhoI and EcoRV sites.
This DNA is transformed into bacteria and the plasmid clones sequenced to identify the
correct protein-coding inserts. Correct 5' ends are confirmed by comparing this sequence
with the putatively identified homologue and overlap with the partial cDNA clone. Similar
methods known in the art and/or commercial kits are used to amplify and recover 3' ends.
[0057] Several quality-controlled kits are commercially available for purchase.
Similar reagents and methods to those above are supplied in kit form from Gibco/BRL for
both 5' and 3' RACE for recovery of full length genes. A second kit is available from
Clontech which is a modification of a related technique, SLIC (single-stranded ligation to
single-stranded cDNA), developed by Dumas et al., Nucleic Acids Res., 19:5227-32
(1991). The major differences in procedure are that the RNA is alkaline hydrolyzed after
reverse transcription and RNA ligase is used to join a restriction site-containing anchor
primer to the first-strand cDNA. This obviates the necessity for the dA-tailing reaction
which results in a polyT stretch that is difficult to sequence past.

[0058] An alternative to generating 5' or 3' cDNA from RNA is to use cDNA
library double-stranded DNA. An asymmetric PCR-amplified antisense cDNA strand is
synthesized with an antisense cDNA-specific primer and a plasmid-anchored primer.
These primers are removed and a symmetric PCR reaction is performed with a nested
cDNA-specific antisense primer and the plasmid-anchored primer.

RNA Ligase Protocol For Generating The 5' or 3' End Sequences To Obtain Full Length
Genes

[0059] Once a gene of interest is identified, several methods are available for the
identification of the 5' or 3' portions of the gene which may not be present in the original
cDNA plasmid. These methods include, but are not limited to, filter probing, clone
enrichment using specific probes and protocols similar and identical to 5' and 3' RACE.
While the full length gene may be present in the library and can be identified by probing, a
useful method for generating the 5' or 3' end is to use the existing sequence information
from the original cDNA to generate the missing information. A method similar to 5'
RACE is available for generating the missing 5' end of a desired full-length gene. (This
method was published by Fromont-Racine et al., Nucleic Acids Res., 21(7):1683-1684
(1993)). Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population
of RNA presumably containing full-length gene RNA transcript. A primer set containing
a primer specific to the ligated RNA oligonucleotide and a primer specific to a known
sequence of the gene of interest is used to PCR amplify the 5' portion of the desired full
length gene which may then be sequenced and used to generate the full length gene. This
method starts with total RNA isolated from the desired source; poly A RNA may be used
but is not a prerequisite for this procedure. The RNA preparation may then be treated with
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged RNA
which may interfere with the later RNA ligase step. The phosphatase, if used, is then
inactivated and the RNA is treated with tobacco acid pyrophosphatase in order to remove
the cap structure present at the 5' ends of messenger RNAs. This reaction leaves a 5'
phosphate group at the 5' end of the cap cleaved RNA which can then be ligated to an
RNA oligonucleotide using T4 RNA ligase. This modified RNA preparation can then be
used as a template for first strand cDNA synthesis using a gene specific oligonucleotide.
The first strand synthesis reaction can then be used as a template for PCR amplification of
the desired 5' end using a primer specific to the ligated RNA oligonucleotide and a primer
specific to the known sequence of the ovarian antigen of interest. The resultant product is
then sequenced and analyzed to confirm that the 5' end sequence belongs to the relevant
ovarian antigen.

[0060] The present invention also relates to vectors or plasmids, which include
such DNA sequences, as well as the use of the DNA sequences. The material deposited
with the ATCC (deposited with the ATCC on June 5, 2000 and given ATCC Deposit
Nos. PTA-1982 and PTA-1985; and/or as set forth, for example, in Tables 1, 6 and 7) is a
mixture of cDNA clones derived from a variety of human tissue and cloned in either a
plasmid vector or a phage vector, as shown, for example, in Table 7. These deposits are
referred to as "the deposits" herein. The tissues from which some of the clones were
derived are listed in Table 7, and the vector in which the corresponding cDNA is
contained is also indicated in Table 7. The deposited material includes cDNA clones
corresponding to SEQ ID NO:X described, for example, in Table 1 (Clone ID NO:Z). A
clone which is isolatable from the ATCC Deposits by use of a sequence listed as SEQ ID
NO:X may include the entire coding region of a human gene or in other cases such clone
may include a substantial portion of the coding region of a human gene. Furthermore,
although the sequence listing may in some instances list only a portion of the DNA
sequence in a clone included in the ATCC Deposits, it is well within the ability of one
skilled in the art to sequence the DNA included in a clone contained in the ATCC
Deposits by use of a sequence (or portion thereof) described in, for example, Tables 1A or
2 by procedures hereinafter further described, and others apparent to those skilled in the
art.

[0061] Also provided in Table 7 is the name of the vector which contains the
cDNA clone. Each vector is routinely used in the art. The following additional
information is provided for convenience.

[0062] Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos. 5,128,256
and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res. 16:7583-7600
(1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res. 17:9494 (1989)) and
pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are commercially available
from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines Road, La Jolla, CA, 92037.
pBS contains an ampicillin resistance gene and pBK contains a neomycin resistance gene.
Phagemid pBS may be excised from the Lambda Zap and Uni-Zap XR vectors, and
phagemid pBK may be excised from the Zap Express vector. Both phagemids may be
transformed into E. coli strain XL-1 Blue, also available from Stratagene.
[0063] Vectors pSport1, pCMVSport 1.0, pCMVSport 2.0 and pCMVSport 3.0
were obtained from Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897.
All Sport vectors contain an ampicillin resistance gene and may be transformed into E.
coli strain DH10B, also available from Life Technologies. See, for instance, Gruber, C.
E., et al., Focus 15:59- (1993). Vector lafmid BA (Bento Soares, Columbia University,
New York, NY) contains an ampicillin resistance gene and can be transformed into E. coli
strain XL-1 Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday
Avenue, Carlsbad, CA 92008, contains an ampicillin resistance gene and may be
transformed into E. coli strain DH10B, available from Life Technologies. See, for
instance, Clark, J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al.,
Bio/Technology 9: (1991).

[0064] The present invention also relates to the genes corresponding to SEQ ID
NO:X, SEQ ID NO:Y, and/or the deposited clone (Clone ID NO:Z). The corresponding
gene can be isolated in accordance with known methods using the sequence information
disclosed herein. Such methods include preparing probes or primers from the disclosed
sequence and identifying or amplifying the corresponding gene from appropriate sources
of genomic material.



[0065] Also provided in the present invention are allelic variants, orthologs, and/or
species homologs. Procedures known in the art can be used to obtain full-length genes,
allelic variants, splice variants, full-length coding portions, orthologs, and/or species
homologs of ovarian associated genes corresponding to SEQ ID NO:X or the complement
thereof, polypeptides encoded by SEQ ID NO:X or the complement thereof and/or the
cDNA contained in Clone ID NO:Z, using information from the sequences disclosed
herein or the clones deposited with the ATCC. For example, allelic variants and/or species
homologs may be isolated and identified by making suitable probes or primers from the
sequences provided herein and screening a suitable nucleic acid source for allelic variants
and/or the desired homologue.

[0066] The polypeptides of the invention can be prepared in any suitable manner.
Such polypeptides include isolated naturally occurring polypeptides, recombinantly
produced polypeptides, synthetically produced polypeptides, or polypeptides produced by
a combination of these methods. Means for preparing such polypeptides are well
understood in the art.

[0067] The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below). It
is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification, such as
multiple histidine residues, or an additional sequence for stability during recombinant
production.

[0068] The polypeptides of the present invention are preferably provided in an
isolated form, and preferably are substantially purified. A recombinantly produced
version of a polypeptide, including the secreted polypeptide, can be substantially purified
using techniques described herein or otherwise known in the art, such as, for example, by
the one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural, synthetic or recombinant
sources using techniques described herein or otherwise known in the art, such as, for
example, antibodies of the invention raised against the ovarian polypeptides of the present
invention in methods which are well known in the art.

[0069] The present invention provides a polynucleotide comprising, or 
alternatively consisting of, the nucleic acid sequence of SEQ ID NO:X, and/or the cDNA 
sequence contained in Clone ID NO:Z. The present invention also provides a polypeptide 
comprising, or alternatively consisting of, the polypeptide sequence of SEQ ID NO:Y, a 
polypeptide encoded by SEQ ID NO:X or a complement thereof, and/or a polypeptide 
encoded by the cDNA contained in Clone ID NO:Z. Polynucleotides encoding a 
polypeptide comprising, or alternatively consisting of, the polypeptide sequence of SEQ ID 
NO:Y, a polypeptide encoded by SEQ ID NO:X, and/or a polypeptide encoded by the 
cDNA contained in Clone ID NO:Z are also encompassed by the invention. The present 
invention further encompasses a polynucleotide comprising, or alternatively consisting of, 
the complement of the nucleic acid sequence of SEQ ID NO:X, a nucleic acid sequence 
encoding a polypeptide encoded by the complement of the nucleic acid sequence of SEQ 
ID NO:X, and/or the cDNA contained in Clone ID NO:Z. 

[0070] Many polynucleotide sequences, such as EST sequences, are publicly 
available and accessible through sequence databases and may have been publicly available 
prior to conception of the present invention. Preferably, such related polynucleotides are 
specifically excluded from the scope of the present invention. Accordingly, for each 
contig sequence (SEQ ID NO:X) listed in the third column of Table 1, preferably excluded 
are one or more polynucleotides comprising a nucleotide sequence described by the 
general formula of a-b, where a is any integer between 1 and the final nucleotide minus 
15 of SEQ ID NO:X, b is an integer of 15 to the final nucleotide of SEQ ID NO:X, where 
both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:X, 
and where b is greater than or equal to a + 14. More specifically, preferably excluded are 
one or more polynucleotides comprising a nucleotide sequence described by the general 
formula of a-b, where a and b are integers as defined in columns 4 and 5, respectively, of 
Table 3. In specific embodiments, the polynucleotides of the invention do not consist of at 
least one, two, three, four, five, ten, or more of the specific polynucleotide sequences 
referenced by the Genbank Accession No. as disclosed in column 6 of Table 3. In further 
embodiments, preferably excluded from the invention are the specific polynucleotide 
sequence(s) contained in the clones corresponding to at least one, two, three, four, five, 
ten, or more of the available material having the accession numbers identified in the sixth 
column of this Table. In no way is this listing meant to encompass all of the sequences 
which may be excluded by the general formula; it is just a representative example. All 
references available through these accessions are hereby incorporated by reference in their 
entirety. 
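
By way of illustration only, the constraints of the general formula a-b recited above may be sketched in Python as follows. The function names `satisfies_a_b_formula` and `fragment` are hypothetical and appear nowhere in this disclosure. The sketch assumes that positions are 1-based and that the stated bounds are inclusive; the text does not settle either point, and a different reading would shift the limits by one.

```python
def satisfies_a_b_formula(a: int, b: int, seq_len: int) -> bool:
    """Check whether positions (a, b) fit the general formula a-b.

    Assumes 1-based positions and inclusive bounds (an interpretation,
    not something the text states explicitly):
      * a is between 1 and the final nucleotide minus 15
      * b is between 15 and the final nucleotide
      * b is greater than or equal to a + 14
    """
    return (
        1 <= a <= seq_len - 15
        and 15 <= b <= seq_len
        and b >= a + 14
    )


def fragment(seq: str, a: int, b: int) -> str:
    """Return the nucleotides at positions a through b (1-based, inclusive)."""
    if not satisfies_a_b_formula(a, b, len(seq)):
        raise ValueError(f"positions {a}-{b} do not satisfy the a-b formula")
    # Convert the 1-based inclusive range to a Python slice.
    return seq[a - 1:b]


if __name__ == "__main__":
    example = "ACGT" * 10  # a made-up 40-nucleotide sequence
    print(fragment(example, 1, 15))
```

Under this reading, b ≥ a + 14 means every fragment described by the formula spans at least 15 nucleotides.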